Abstract

Advances in both land surface models (LSMs) and land surface data assimilation, especially
over the last decade, have substantially improved the ability of land data assimilation systems
(LDAS) to estimate evapotranspiration (ET). This article provides a historical perspective on
international LSM intercomparison efforts and the development of LDAS systems, both of which
have improved LSM ET skill. In addition, an assessment of ET estimates for current LDAS
systems is provided along with current research that demonstrates improvement in LSM ET
estimates due to assimilating satellite-based soil moisture products. Using the Ensemble Kalman
Filter in the Land Information System, we assimilate both NASA and Land Parameter Retrieval
Model (LPRM) soil moisture products into the Noah LSM Version 3.2 with the North American
LDAS phase 2 (NLDAS-2) forcing to mimic the NLDAS-2 configuration. Through comparisons
with two global reference ET products, one based on interpolated flux tower data and one from a
new satellite ET algorithm, over the NLDAS-2 domain, we demonstrate improvement in ET
estimates only when assimilating the LPRM soil moisture product.

INTRODUCTION

Land surface models predict terrestrial water, energy, momentum, and in some cases,
biogeochemical exchange processes by solving the governing equations of the
soil-vegetation-snowpack medium based on atmospheric boundary conditions including
precipitation, radiation, wind, temperature, humidity and pressure. By constraining land surface
models with observed atmospheric boundary conditions and land surface states, land surface
data assimilation improves our ability to understand and predict terrestrial water and energy
fluxes and states, including evapotranspiration. The ability to predict evapotranspiration is
critical for applications in weather and climate prediction, agricultural forecasting, water
resources management, and hazard mitigation (e.g., NRC, 2010; 2011). Until recently, global or
continental land surface modeling at horizontal scales of 1 km or finer was infeasible due to
limits in computational and observational resources.

Land Data Assimilation Systems (LDAS, Figure 1) are typically run “uncoupled” (or
“offline”) to estimate water and energy fluxes and states using observationally-based
precipitation, radiation and meteorological inputs. However, they may also be run “coupled” to
an atmospheric model for weather forecasts.

This paper reviews current and developing capabilities for estimating evapotranspiration
(ET) using land surface models (LSMs) as part of an LDAS. First, we present a survey of land
surface modeling for ET estimation, including recent intercomparison studies and LDAS efforts.
Next, we compare LSM ET estimates from current LDAS systems to global gridded tower-based
and remote sensing-based flux estimates. Finally, we present the results from simulations that
employ data assimilation (DA) of remotely sensed soil moisture measurements to improve LSM
ET estimates.

BACKGROUND

The ability of LSMs or soil-vegetation-atmosphere transfer schemes (SVATS) to predict
evapotranspiration has advanced significantly since the original bucket (Manabe, 1969), Simple
Biosphere (SiB; Sellers et al., 1986) and Biosphere-Atmosphere Transfer Scheme (BATS;
Dickinson et al., 1986) models pioneered at the National Oceanic and Atmospheric
Administration’s Geophysical Fluid Dynamics Laboratory (NOAA/GFDL), National
Aeronautics and Space Administration’s Goddard Space Flight Center (NASA/GSFC) and the
National Center for Atmospheric Research (NCAR), respectively. Numerous advancements in
second-generation LSMs have brought additional focus to snow physics and hydrology, such as
the community Noah (Ek et al., 2003; Barlage et al., 2010; Livneh et al., 2010) and the Variable
Infiltration Capacity (VIC; Liang et al., 1996; Bowling and Lettenmaier, 2010) models. So-called
“third generation” LSMs include dynamic phenology and carbon stores, such as the Community
Land Model (CLM; Bonan et al., 2002; Lawrence et al., 2011). To a large extent, this advancement
has come as the result of three key community activities: first, the Global Land Atmosphere
System Study (GLASS) intercomparison studies; second, the North American and Global LDAS
projects; and third, the recent LandFlux initiatives. Below, we provide background on these
efforts, including their major findings related to evapotranspiration estimation from LSMs.

GLASS Intercomparison Studies: PILPS, Rhone-AGG, and GSWP

The international GLASS panel, as part of the Global Energy and Water Cycle Experiment
(GEWEX), has spearheaded three major intercomparison projects designed to evaluate the skill
of land surface models for predicting water and energy fluxes and states. The first was the
Project to Intercompare Land-Surface Parameterization Schemes (PILPS; Henderson-Sellers et
al., 1995; Pitman and Henderson-Sellers, 1998). This project focused on a series of evaluations
conducted in phases using specified atmospheric boundary conditions and parameters at points or
regions. One of the key findings of this project is the documentation of the systematic
improvements in LSMs from first-generation (bucket) to second-generation (e.g., SiB, BATS,
Noah, VIC) through third-generation (e.g., CLM) models. Another major finding from PILPS is
the synthesis work by Koster and Milly (1997), in which it was shown that the interplay between
the evaporation and runoff formulations in any LSM could be expressed via two model-independent
quantities: 1) the soil-depth-integrated evaporation sink efficiency and 2) the runoff-generation
fraction over this integrated evaporation sink. A third major finding was that
hydrologically-oriented models such as VIC were shown to be more skillful for continental-scale
water budgets.

The second major GLASS intercomparison projects, which represented a global-scale
follow-on to PILPS, were the Global Soil Wetness Projects (GSWPs; Dirmeyer, 2011). GSWP-1
(Dirmeyer et al., 1999) focused on the International Satellite Land-Surface Climatology Project
(ISLSCP) Initiative I forcing data for the period 1987-88, and produced the first ever global,
offline, multimodel land analysis based on “best possible” meteorological forcings. In addition,
GSWP-1 served as a pathfinder for the NLDAS and GLDAS efforts described below. GSWP-2
(Dirmeyer et al., 2006) built on the foundation of GSWP-1, and produced 1-degree global
multi-model fluxes and states for the ISLSCP II period from 1986-95 and showed that, in the
absence of robust in-situ and/or remotely sensed soil moisture to provide constraints, the best
estimate of soil wetness from multiple model products is a simple average. In addition to this
finding, GSWP-2 also derived multi-model soil wetness values normalized to the LSM dynamic
range controlling the ET-runoff interplay, as described above in the PILPS analyses by Koster and
Milly (1997). With respect to ET, GSWP-2 showed that ET has the smallest interannual
variability of any water budget variable, and that global average transpiration is about one-third
larger than direct evaporation from the soil. For the GSWP-2 period, latent heat flux exceeded
sensible heat flux by about 20%, although that may reflect an absence of soil moisture limitations
in later periods as observed by Jung et al. (2010) and discussed further below.

The third major intercomparison project, which occurred between GSWP-1 and GSWP-2, is
known as Rhone-AGG (Boone et al., 2004). Rhone-AGG significantly advanced the
community’s ability to observationally diagnose deficiencies in LSM hydrological cycles by
looking at spatial scaling of water and energy balance processes at scales finer than GSWP
(8 km vs. 1°), particularly the interplay of high-elevation snow accumulation/melt and
lower-elevation streamflow. Techniques for evaluating and diagnosing physical processes with
simulated hydrographs helped advance the ability of LSMs to simulate the daily hydrological
cycle at multiple scales, most notably by implementing subgrid runoff formulations and
elevation-based tiling for snowpack modeling. Rhone-AGG and GSWP-2 occurred in parallel with
and greatly benefitted the development of NLDAS and GLDAS, as described in the following
sections.

The North American Land Data Assimilation System

The primary goal of the North American Land Data Assimilation System (NLDAS; Mitchell
et al., 2004; http://ldas.gsfc.nasa.gov/nldas/; http://www.emc.ncep.noaa.gov/mmb/nldas/) is to
construct quality-controlled, and spatially- and temporally-consistent, land-surface model (LSM)
datasets from the best available observations and model outputs. NLDAS is a collaborative
project among several groups: NOAA/NCEP's Environmental Modeling Center (EMC), NASA's
Goddard Space Flight Center (GSFC), Princeton University, the University of Washington, the
NOAA/NWS Office of Hydrological Development (OHD), and the NOAA/NCEP Climate
Prediction Center (CPC). The NLDAS project produces an LSM forcing dataset from a daily
gauge-based precipitation analysis (temporally disaggregated using hourly radar data, satellite
estimates, or other sources), bias-corrected shortwave radiation, and surface meteorology
reanalyses. This forcing is used to drive four separate LSMs to generate hourly model outputs of
surface fluxes, soil moisture, snow cover, and runoff. The current operational version of
NLDAS uses the following LSMs: Noah (from NOAA/NCEP), Mosaic (from GSFC), VIC
(from Princeton University), and SAC (from NOAA/OHD). Datasets and simulations from
NLDAS Phase 2 (NLDAS-2) extend back to January 1979 and continue to be produced in near
real-time on a 1/8th-degree grid over central North America (from 25 to 53°N and 125 to 67°W).
NLDAS individual and ensemble-mean LSMs are also used for drought monitoring and as part
of an experimental drought forecast system. The ensemble mean on the drought monitor is a
simple type of multi-model LSM analysis; such analyses have been shown to improve the
depiction of simulated states in many ways (e.g., Guo et al., 2007, for GSWP-2 datasets). NLDAS
data products are distributed at EMC as well as at the NASA Goddard Earth Sciences Data and
Information Services Center (GES DISC; http://disc.gsfc.nasa.gov/hydrology/).

The first incarnation of the project, Phase 1 (NLDAS-1), comprised data since October 1996
and used a similar but not identical LSM forcing dataset (Cosgrove et al., 2003).
Earlier versions of the same four LSMs (Noah, SAC, VIC, and Mosaic) were used in NLDAS-1
as well. NLDAS-1 datasets were extensively evaluated and validated against available
observations in numerous studies, including examinations of the forcing (Luo et al., 2003) and of
the LSM output (Robock et al., 2003). Robock et al. (2003) evaluated the LSM-simulated soil
moistures and temperatures against observations from the Oklahoma Mesonet, and also
evaluated surface latent, sensible, and ground heat fluxes using ARM/CART stations. They found
that the Noah LSM was closest to the observations of the latent heat flux in this region over a
two-year period. Lohmann et al. (2003) intercompared water balance and streamflow between
the LSMs and found regional differences of up to a factor of 4 in the simulated mean annual
runoff and up to a factor of 2 in the mean annual evapotranspiration, with monthly differences
even greater. Other land parameters evaluated from NLDAS-1 included soil moisture (Schaake et
al., 2004), snow cover extent (Sheffield et al., 2003), and snow water equivalent (Pan et al., 2003).
Many of these studies from NLDAS-1 also tested the effects of LSM physics and parameter
changes on the evaluation results.

The NLDAS-2 forcing dataset corrects the daily gauge precipitation analysis using a PRISM
(Parameter-elevation Regressions on Independent Slopes Model; Daly et al., 1994) method, which
considers the topographic effect on precipitation. The precipitation is temporally disaggregated
to hourly, primarily using Stage II radar data. In locations and at times when the radar data are
not available, satellite retrievals, a coarser-scale hourly gauge analysis, or reanalysis data are
used. The non-precipitation land-surface forcing fields for NLDAS-2 are derived from the analysis
fields of the NCEP North American Regional Reanalysis (NARR; Mesinger et al., 2006).
Surface pressure, surface downward longwave radiation, and near-surface temperature and
humidity fields are vertically adjusted to the terrain on the NLDAS grid. The surface downward
shortwave radiation is bias-corrected using GOES satellite observations. NLDAS-2 also
contains numerous improvements to the equations of the LSMs as well as their calibration. The
snow physics in the Noah LSM was upgraded (Livneh et al., 2010), VIC model parameters were
calibrated using streamflow observations (Troy et al., 2008), SAC used an updated potential
evaporation dataset (Xia et al., 2011b), and Mosaic used updated model parameters (for details,
see Robock et al., 2003). Xia et al. (2011a) analyzed water and energy fluxes in the upgraded
NLDAS-2 LSMs, including their ensemble mean and model spread. In a separate study, Xia et
al. (2011c) examined the spatial distribution of the correlation between monthly-mean
precipitation and evapotranspiration (ET) in the four LSMs; they found that the two
soil-vegetation-atmosphere transfer (SVAT) LSMs (Noah and Mosaic) had a stronger
precipitation-ET correlation, while the two hydrological LSMs (VIC and SAC) had a stronger
correlation between the precipitation and runoff. Wei et al. (2011) evaluated improvements of the
Noah LSM related to warm season simulation in NLDAS by adding a seasonally- and
spatially-varying LAI as well as modifications to Noah’s treatment of the vertical profile of root
density, the minimum stomatal resistance parameters, the diurnal variation of surface albedo, the
roughness length for heat, and the vapor-pressure and soil moisture deficit terms. This study
compared the NLDAS-2 version of Noah to ARM/CART latent heat flux observations and found
reduced biases, which also helped improve the simulation of the mean annual water balance. Mo
et al. (2011) compared ET from three NLDAS-2 LSMs against Ameriflux observations and found
that Noah and VIC tended to exhibit low ET biases in the winter, with slight high ET biases in
the summer, despite no apparent biases in NLDAS-2 net radiation. Mosaic generally had higher
ET than both the observations and Noah and VIC, with a three-member ensemble mean performing
the best, consistent with the findings of GSWP discussed in the previous section. Kovalskyy et
al. (2011) estimated evapotranspiration using a scheme that combines a water balance model
with an event-driven phenology model, driven with NLDAS-2 forcing; they compared these
estimates against ET from the MODerate Resolution Imaging Spectroradiometer (MODIS)
instrument (Mu et al., 2007) as well as from the NLDAS-2 Mosaic LSM, and showed better
agreement with the MODIS-derived ET at the 5-km scale of the study.

The Global Land Data Assimilation System

The Global Land Data Assimilation System (GLDAS), led at NASA/GSFC (Rodell et al.,
2004a), also uses satellite- and ground-based observations to construct a forcing dataset to drive
four LSMs. The four LSMs in GLDAS are Noah, Mosaic, VIC, and CLM, and GLDAS data
extend globally from January 1979 at both 1.0-degree (all LSMs) and 0.25-degree (Noah
only) resolution. In addition to extending an NLDAS-style framework to the global scale, GLDAS
was one of the first LDASs to routinely assimilate satellite-based surface states to improve
simulated water and energy fluxes and states. GLDAS has included data assimilation of MODIS
snow cover to constrain the modeled snow water equivalent (SWE; after Rodell and Houser, 2004),
and has also been used to study the effects of assimilating remotely-sensed skin temperatures and
soil moistures. Considering ET produced by GLDAS, Rodell et al. (2004b) compared basin-scale
estimates of evapotranspiration produced by GLDAS/Noah and other models against a water
balance approach using the Gravity Recovery And Climate Experiment (GRACE) satellites, and
found that the GRACE estimates were generally within the range of the model results, that the
biases were consistent, and that the uncertainty was of the same order as that of GRACE. Kato et
al. (2007) used GLDAS to examine the effects of the choice of LSM, land cover, soils, elevation,
and forcings on the simulated latent and sensible heat fluxes and soil moisture, compared to CEOP
in situ observations. They found that the LSM choice had the biggest effect on the simulated
output (including ET), and that ET was most sensitive first to precipitation, then land cover, and
then radiation. Syed et al. (2008) compared variations in terrestrial water storage from GRACE
with GLDAS simulations and found that ET was most effective in dissipating terrestrial water
storage in the mid-latitudes. Despite all these detailed studies, however, none has directly
evaluated both GLDAS and NLDAS using observations over CONUS. One study that did compare
early versions of both systems (Jambor et al., 2002) demonstrated the benefit of satellite-based
precipitation used in conjunction with model precipitation in GLDAS while using the gauge-based
precipitation in NLDAS as the evaluation dataset. Overall, GLDAS’s advancements in
land data assimilation and GRACE-based ET estimation significantly advanced the ability of
LSMs to estimate ET, subject to observational constraints.
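
For reference, the basin-scale water balance that underlies GRACE-based ET estimates of this
kind can be written in the generic form below. The symbols are chosen here for illustration only;
the specific data sources and error treatment used by Rodell et al. (2004b) are not reproduced.

```latex
ET = P - Q - \frac{\Delta S}{\Delta t}
```

Here $P$ is basin-averaged precipitation, $Q$ is net streamflow out of the basin expressed as a
depth per unit time, and $\Delta S / \Delta t$ is the change in terrestrial water storage over the
averaging interval, which is the term that GRACE observes.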

The LandFlux Initiative and Reference ET Products

The LandFlux initiative has been coordinated by the GEWEX Radiation Panel to develop
and evaluate consistent and high-quality global ET datasets for climate studies. Recently
developed capabilities for global ET estimation using LSMs (as discussed above) as well as
techniques for synthesizing satellite data, flux tower data, and atmospheric reanalyses provide
the opportunity to produce global ET products using different approaches. The LandFlux-EVAL
project (Mueller et al., 2011; Jimenez et al., 2011) is currently evaluating multiple global ET
products produced using four different categories of techniques: 1) observations-based
diagnostic datasets; 2) observationally-driven “offline” LSM products (e.g., GSWP, GLDAS); 3)
atmospheric reanalyses; and 4) IPCC AR4 simulations from 11 GCMs. In Mueller et al. (2011), 41
global land ET datasets were evaluated along with IPCC AR4 GCM simulations for the
1989-1995 time period. An interesting finding of a cluster analysis conducted as part of this
study is that the GLDAS-Noah and CLM products were closely related to two different reference
ET products, including the Jung et al. (2009) product described further below.

Jimenez et al. (2011) evaluated 12 monthly mean land surface ET and other flux products for
the period 1993-1995 and found that the 12-product global annual mean latent heat flux (Qle)
was approximately 45 W m^-2 with a spread of approximately 20 W m^-2. Similar spreads were
found for sensible (Qh) and net radiative (Rn) fluxes, with larger spreads for tropical rainforest
year-round and grassland or crop in the dry season. Analysis for large river basins indicated
large spreads for the Danube, Congo, Volga, and Nile basins, with smaller spreads for other
basins, including the Mississippi.

One of the key reference datasets for the LandFlux-EVAL effort, also used as one of the
reference datasets in our LDAS analysis described in subsequent sections, is the Max Planck
Institute (MPI) flux dataset from Jung et al. (2009), which was created by synthesizing
FLUXNET (Baldocchi et al., 2001) tower data with meteorological forcings and vegetation
information from interpolated station and satellite data to produce a global, monthly, 1/2-degree
resolution estimate of land ET from 1982 to 2008. Jung et al. (2010) found that global annual
ET increased by approximately 7 mm per year per decade during the period 1982-1997,
with moisture limitation eliminating this trend during the period 1998-2008. Another ET
product used as a reference dataset in this study is the global 1-km ET estimate based on MODIS
satellite data (MOD16; Mu et al., 2011). In this dataset, ET estimates are derived using the
algorithm of Mu et al. (2011), which is improved relative to the previous Mu et al. (2007) work.
The ET algorithm is primarily based on the Penman-Monteith equation and considers the surface
energy partitioning and environmental constraints to derive ET. In this study, we employ the
monthly averaged MOD16 ET datasets.
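
For readers unfamiliar with it, the standard form of the Penman-Monteith equation is given
below. This is the textbook expression with conventional symbols; the specific resistance
parameterizations and the partitioning of available energy used in the MOD16 algorithm are
described by Mu et al. (2011) and are not reproduced here.

```latex
\lambda E = \frac{\Delta\,(R_n - G) + \rho_a c_p\,\dfrac{e_s - e_a}{r_a}}
                 {\Delta + \gamma\left(1 + \dfrac{r_s}{r_a}\right)}
```

Here $\lambda E$ is the latent heat flux, $\Delta$ is the slope of the saturation vapor pressure
curve, $R_n - G$ is the available energy (net radiation minus ground heat flux), $\rho_a$ and
$c_p$ are the density and specific heat of air, $e_s - e_a$ is the vapor pressure deficit,
$\gamma$ is the psychrometric constant, and $r_a$ and $r_s$ are the aerodynamic and surface
resistances.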

Although the LandFlux-EVAL effort has compared model-based GLDAS and GSWP flux
estimates to observationally-based MOD16 and MPI reference flux estimates, a key question not
previously addressed by LandFlux-EVAL is the extent to which assimilating observed soil
moisture can reduce the differences between the model-based and observationally-based flux
estimates. Addressing this question is one of the primary motivations of the current work.

EXPERIMENTAL SETUP

To illustrate the capability of current LDAS systems to simulate
evapotranspiration at continental scales, we compare estimates from GLDAS and two
NLDAS-equivalent simulations over the NLDAS domain with two reference datasets: (1) the
gridded FLUXNET dataset from Jung et al. (2009) and (2) the MOD16 dataset developed by Mu et
al. (2011). Further, we also present estimates from the NLDAS-equivalent simulations that employ
the assimilation of satellite surface soil moisture retrievals. Because NLDAS uses only a
single version of the Noah LSM, we chose to produce our NLDAS-equivalent products using the
Land Information System (LIS; Kumar et al., 2006; Peters-Lidard et al., 2007) with Noah
versions 2.7.1 and 3.2 so that we can examine the impacts of recent physics changes in Noah on
ET estimation. The experiments employ the same domain configuration used in the NLDAS
project (from 25-53°N and 125-67°W at 1/8th-degree resolution) and are designed in a manner as
similar as possible to the NLDAS-2 Noah model simulations. While the forcings and
NLDAS-equivalent simulations are at 1/8-degree horizontal resolution, we average the outputs to
1/2-degree resolution prior to comparisons with the FLUXNET and MOD16 datasets. The GLDAS
Noah datasets are similarly averaged from 1/4-degree to 1/2-degree resolution. The forcing for
the NLDAS-equivalent runs is the NLDAS-2 forcing described above. The simulations are run with
a 15-minute timestep, and the models are spun up by running from 1979 to 1985 and then
reinitializing the model in 1979 to generate outputs for 1979-2010.
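
The spatial averaging step described above can be sketched as a simple block average. This is a
minimal illustration that assumes an unweighted mean over non-overlapping blocks and a grid whose
dimensions are exact multiples of the aggregation factor; the function name `block_average` is
hypothetical, and the averaging actually used in the study (for example, any area weighting) is
not specified in the text.

```python
import warnings

import numpy as np


def block_average(field, factor):
    """Average a 2-D (lat, lon) field over non-overlapping factor x factor blocks.

    Missing values (NaN) are ignored within each block; a block that is
    entirely missing stays NaN. Unweighted averaging is an assumption.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by the factor")
    # Split each axis into (coarse index, position within block).
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    with warnings.catch_warnings():
        # Silence the "mean of empty slice" warning for all-NaN blocks.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return np.nanmean(blocks, axis=(1, 3))


# 1/8-degree NLDAS grid (224 x 464 cells) to 1/2 degree: factor of 4.
nldas_like = np.zeros((224, 464))
half_degree = block_average(nldas_like, 4)
```

With a factor of 4 the 1/8-degree grid maps to 1/2 degree, and a factor of 2 would do the same
for a 1/4-degree GLDAS field.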

In the data assimilation integrations, we employ surface soil moisture data derived from the
Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) sensor
aboard the Aqua satellite. Two different AMSR-E retrieval products are employed in the data
assimilation simulations: (1) the NASA Level-3 “AE_Land3” product (version 6; Njoku et al.,
2003) and (2) the AMSR-E Land Parameter Retrieval Model (LPRM) product developed at NASA
GSFC and VU Amsterdam (Owe et al., 2008). The NASA product is primarily based on X-band
brightness temperatures, whereas both X-band and C-band brightness temperature-based
retrievals are used in the LPRM product. Measurements from both ascending and descending
overpasses are used in these products. A number of quality control measures are applied to the
soil moisture retrievals prior to data assimilation, similar to the approaches followed in Reichle
et al. (2007) and Liu et al. (2011). In the soil moisture products, retrievals flagged for dense
vegetation, precipitation, snow cover, frozen ground, and Radio Frequency Interference (RFI)
are excluded from the assimilation system. Further, additional quality control is applied based on
information from the land surface model: retrievals are excluded when the land
surface model indicates active precipitation, non-zero snow cover, frozen soil, or dense
vegetation (green vegetation fraction > 0.7).
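
The two-stage screening described above can be sketched as a boolean mask. This is an
illustrative sketch only: the function and argument names are hypothetical, the product flags are
assumed to have been collapsed into a single boolean array beforehand, and the use of a 273.15 K
soil temperature threshold to detect frozen soil is an assumption not stated in the text. Only the
green vegetation fraction threshold of 0.7 comes from the description above.

```python
import numpy as np


def retrieval_is_usable(product_flagged, lsm_precip, lsm_snow, lsm_soil_temp,
                        lsm_gvf, gvf_max=0.7, freezing_point=273.15):
    """Return True where a soil moisture retrieval passes both QC stages.

    product_flagged : True where the retrieval product itself flags dense
                      vegetation, precipitation, snow, frozen ground, or RFI.
    lsm_precip      : model precipitation rate (any unit; > 0 means active).
    lsm_snow        : model snow cover or snow amount (> 0 means snow present).
    lsm_soil_temp   : model surface soil temperature in kelvin (assumed proxy
                      for frozen soil).
    lsm_gvf         : model green vegetation fraction (0 to 1).
    """
    usable = ~np.asarray(product_flagged, dtype=bool)       # stage 1: product flags
    usable &= np.asarray(lsm_precip) <= 0.0                 # stage 2: no active precipitation
    usable &= np.asarray(lsm_snow) <= 0.0                   # no snow cover
    usable &= np.asarray(lsm_soil_temp) > freezing_point    # soil not frozen
    usable &= np.asarray(lsm_gvf) <= gvf_max                # vegetation not dense
    return usable
```

Retrievals where the mask is False would simply be skipped, so the model state at those grid
points and times evolves without an analysis update.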

The assimilation integrations employ a one-dimensional Ensemble Kalman Filter (EnKF)
algorithm, which is a widely used technique for soil moisture data assimilation (Reichle et al.,
2002; Crow and Wood, 2003; Reichle et al., 2007; Kumar et al., 2008; Kumar et al., 2009). An
ensemble size of 12 is used in these simulations (Kumar et al., 2008), with perturbations applied
to both the meteorological fields and model prognostic fields to simulate uncertainty in the soil
moisture fields. The parameters used for these perturbations, listed in Table 1, are
based on earlier data assimilation studies (Kumar et al., 2009). Because algorithms such as the
EnKF are designed to correct random, zero-mean errors and assume the use of observations that
are unbiased relative to the model-generated background, it is common practice to scale the
observations prior to data assimilation to match the model’s climatology (Reichle and Koster,
2004; Drusch et al., 2005; Reichle et al., 2007; Kumar et al., 2009). Here we employ the
Cumulative Distribution Function (CDF)-scaling approach of Reichle and Koster (2004), where
the observations (roughly corresponding to a maximum depth of 2 cm) are rescaled to the
model’s 10-cm surface soil moisture climatology by matching the CDF of the observations to the
CDF of the model soil moisture. The model CDF and observation CDF are computed using 7
years of data (2002-2008), separately for each grid point.
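
The CDF-scaling step can be illustrated for a single grid point as follows. This is a minimal
sketch of empirical CDF matching using linear interpolation between sorted climatological
samples; the function name `cdf_match` and the plotting-position formula are choices made here
for illustration, and the exact implementation of Reichle and Koster (2004) or of LIS may differ.

```python
import numpy as np


def cdf_match(obs_clim, model_clim, obs_new):
    """Rescale observations to the model climatology at one grid point.

    obs_clim, model_clim : 1-D samples defining the observation and model
                           climatologies (for example, 2002-2008).
    obs_new              : observation value(s) to rescale.
    Each new observation is assigned its non-exceedance probability in the
    observation climatology and replaced by the model value that has the
    same probability in the model climatology.
    """
    obs_sorted = np.sort(np.asarray(obs_clim, dtype=float))
    model_sorted = np.sort(np.asarray(model_clim, dtype=float))
    # Empirical non-exceedance probabilities at the sorted samples.
    p_obs = (np.arange(obs_sorted.size) + 0.5) / obs_sorted.size
    p_model = (np.arange(model_sorted.size) + 0.5) / model_sorted.size
    # Observation value -> probability -> model value.
    prob = np.interp(obs_new, obs_sorted, p_obs)
    return np.interp(prob, p_model, model_sorted)
```

Values outside the range of the observation climatology are clamped to the extremes of the model
climatology by `np.interp`, which is one simple way to handle the tails.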

As the soil moisture retrievals are available only from 2002 onwards, the NLDAS-equivalent
simulations with data assimilation are conducted for the period 2002-2008. During this
period, we update not only the surface (10-cm) soil moisture in Noah, but also the layer 2 through
layer 4 soil moisture, following the parameters in Table 1. The comparisons presented in the next
section are limited to the data assimilation period (2002-2008).
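
To make concrete how a surface observation can update all four Noah soil layers, the analysis
step of a one-dimensional EnKF at a single grid point can be sketched as below. This is a generic
perturbed-observation EnKF under stated assumptions, not the LIS implementation: the function
name, the observation error value, and the synthetic ensemble are hypothetical, and the
perturbation settings of Table 1 are not reproduced. The deeper layers are updated through their
ensemble covariance with the surface layer.

```python
import numpy as np


def enkf_update(ensemble, obs, obs_err_std, rng):
    """Perturbed-observation EnKF analysis for one soil column.

    ensemble    : (n_members, n_layers) soil moisture forecast; column 0 is
                  the surface layer that the (rescaled) retrieval observes.
    obs         : scalar rescaled surface soil moisture observation.
    obs_err_std : observation error standard deviation.
    rng         : numpy random Generator used to perturb the observation.
    """
    n_members = ensemble.shape[0]
    predicted_obs = ensemble[:, 0]                       # H x: surface layer
    state_anom = ensemble - ensemble.mean(axis=0)
    obs_anom = predicted_obs - predicted_obs.mean()
    # Covariance of every layer with the predicted observation.
    cov_state_obs = state_anom.T @ obs_anom / (n_members - 1)
    var_obs = obs_anom @ obs_anom / (n_members - 1)
    gain = cov_state_obs / (var_obs + obs_err_std ** 2)  # Kalman gain, one value per layer
    # Each member assimilates its own perturbed copy of the observation.
    perturbed = obs + rng.normal(0.0, obs_err_std, size=n_members)
    innovations = perturbed - predicted_obs
    return ensemble + np.outer(innovations, gain)


# Synthetic 12-member, 4-layer forecast with vertically correlated errors.
rng = np.random.default_rng(0)
surface = 0.20 + 0.03 * rng.standard_normal(12)
forecast = np.column_stack(
    [surface + 0.01 * k + 0.002 * rng.standard_normal(12) for k in range(4)]
)
analysis = enkf_update(forecast, obs=0.30, obs_err_std=0.02, rng=rng)
```

In this synthetic case the observation is wetter than the forecast, so the ensemble mean of every
layer is nudged upward, with the size of the increment in each layer set by its covariance with
the surface layer.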

CURRENT RESEARCH FINDINGS AND FUTURE WORK

The results presented in this section focus first on the evaluation of the LDAS ET estimates
that do not employ data assimilation. This is followed by a description of the impact of soil
moisture data assimilation on ET estimation.

Evaluation of the ET estimates from LDAS simulations

Table 2 presents the domain-averaged root mean square and bias errors, and the associated 95%
confidence intervals, for latent (Qle) and sensible (Qh) heat flux estimates from the three LDAS
simulations compared against the gridded FLUXNET and MOD16 datasets. This table also
shows results from the data assimilation experiments to be discussed in the next section.
Overall, the NLDAS-like simulation using the Noah 2.7.1 model provides better estimates of Qle
(RMSE of 19.3 W m^-2 against FLUXNET and 21.5 W m^-2 against MOD16) relative to the other
products. Average seasonal cycles of these error metrics, stratified monthly, are presented in
Figure 2. In both sets of comparisons, the largest differences between the LDAS simulations are
observed during the spring and fall months, with the NLDAS-like simulation with Noah 2.7.1
providing the better estimates. Qle estimates from GLDAS show underestimation in the late
summer and fall months and overestimation in the spring and early summer months, relative
to both reference datasets. By comparison, the NLDAS-like integration with Noah 2.7.1 indicates
lower biases in most months, but the biases are consistently positive. Though Noah 3.2 is a newer
version of the Noah model, the flux estimates appear to be degraded overall relative to Noah
2.7.1, with the comparison against FLUXNET data indicating more severe degradations than the
comparison against MOD16. This may reflect uncertainty in the reference flux datasets during
the springtime, in particular. During the fall months, bias estimates in Noah 3.2 are improved
relative to Noah 2.7.1 (in the comparisons against FLUXNET), but during spring and summer
months, biases in Noah 3.2 ET increase compared to those of Noah 2.7.1. These differences in
RMSE and bias errors are highly statistically significant, as indicated by the 95% confidence
interval values given for each error estimate. Note that any spatial autocorrelation of RMSE and
bias values across the domain is ignored in computing these confidence intervals. The tight
intervals reported in Table 2 are likely to widen if allowances for spatial autocorrelation of
errors are included in the confidence interval computations. The increased flux error estimates in
Noah 3.2 relative to Noah 2.7.1 are likely a result of the changes in model parameters (such as
LAI) along with other changes to Noah’s warm season physics as described in Wei et al. (2011).
As discussed in the Background section, Wei et al. (2011) showed that these physics changes
improved agreement with ARM/CART flux datasets, whereas our analysis uses gridded data
(FLUXNET and MOD16) over the entire NLDAS domain, including many different vegetation
types and climate regimes. Other studies (also presented in the Background section) showed
Noah 3.2’s improved simulation of streamflow, snow, and other hydrologic variables relative to
Noah 2.7.1.
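
One way to compute domain-averaged error metrics with confidence intervals of the kind reported
in Table 2 is sketched below. The text does not specify the exact procedure, so this sketch
assumes that RMSE and bias are first computed over time at each grid cell, that the domain value
is the mean across cells, and that the 95% interval uses a normal approximation treating cells as
independent, which is the simplification noted above. The function name is hypothetical.

```python
import numpy as np


def domain_error_stats(model, reference):
    """Domain-mean bias and RMSE with 95% confidence half-widths.

    model, reference : (n_times, n_cells) arrays of monthly flux values.
    Returns a dict mapping "bias" and "rmse" to (domain mean, half-width).
    Spatial autocorrelation between cells is ignored, so the intervals are
    narrower than they would be with an effective sample size correction.
    """
    diff = np.asarray(model, dtype=float) - np.asarray(reference, dtype=float)
    per_cell = {
        "bias": np.nanmean(diff, axis=0),
        "rmse": np.sqrt(np.nanmean(diff ** 2, axis=0)),
    }
    stats = {}
    for name, values in per_cell.items():
        values = values[np.isfinite(values)]             # drop cells with no data
        half_width = 1.96 * values.std(ddof=1) / np.sqrt(values.size)
        stats[name] = (values.mean(), half_width)
    return stats
```

Replacing `values.size` with an effective number of independent cells would be one simple way to
account for spatial autocorrelation and would widen the reported intervals.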

339 

Figure 3 provides an intercomparison of the seasonally averaged Qle, computed using estimates
from three consecutive months (DJF represents December-January-February, MAM represents
March-April-May, JJA represents June-July-August and SON represents
September-October-November), from the three LDAS integrations and the two reference datasets,
the gridded FLUXNET and MOD16 products. Relative to FLUXNET, all three LDAS datasets show higher
ET during the spring (MAM) months in the Southeast and Lower Mississippi River Basin. During
JJA, however, GLDAS compares much better than the NLDAS-like Noah simulations. Interestingly,
the Noah 2.7.1 and Noah 3.2 results for JJA are generally similar except over highly vegetated
crop regions in the upper Midwest and the irrigated growing areas along the Mississippi River.
Neither reference dataset seems to reflect irrigated areas (see, for example, Table 4 in Mu et
al., 2011), and the differences between Noah 2.7.1 and Noah 3.2 do not include irrigation
effects, so these differences for crops are likely due to changes in the aerodynamic conductance
formulation in Noah, implemented primarily for improved snowmelt modeling. Here, the Noah 3.2
JJA Qle is much higher, indicating that the parameter values used for the vegetation type(s) in
these regions may need refinement. A closer look at MAM shows the same pattern, with higher Qle
in Noah 3.2 relative to Noah 2.7.1. All datasets except MOD16 show an artifact of lower Qle in
California and the West Coast regions. Interestingly, the higher Qle areas for MOD16 do not seem
to correspond to known irrigated areas. During the fall (SON) months, the Qle from GLDAS is low
compared to FLUXNET, particularly over the Upper Plains and Southeast; the Noah simulations show
a pattern that is overall much closer to FLUXNET, but with Qle magnitudes that are too high
right along the Gulf Coast. MOD16, on the other hand, indicates lower Qle over the High Plains,
consistent with GLDAS, and higher Qle over the Southeast, consistent with the NLDAS-based
estimates.
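The seasonal grouping used for Figure 3 can be sketched as follows. This is a minimal
illustration for a single grid cell, assuming monthly mean Qle values are already available as
(month, value) pairs; the function name seasonal_means and the sample values are hypothetical
and are not taken from the article.

```python
from collections import defaultdict

# Month number -> three-month season label used in Figure 3.
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def seasonal_means(monthly_qle):
    """Average (month, value) pairs into one mean per season."""
    buckets = defaultdict(list)
    for month, value in monthly_qle:
        buckets[SEASON_OF_MONTH[month]].append(value)
    return {season: sum(vals) / len(vals) for season, vals in buckets.items()}


# Illustrative monthly Qle values in W m^-2 for one grid cell (made up).
example = [(1, 10.0), (2, 14.0), (12, 12.0), (6, 90.0), (7, 110.0), (8, 100.0)]
means = seasonal_means(example)
```

Seasons with no input months are simply absent from the result, so a caller can tell a missing
season apart from a season whose mean is zero.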

Impact of soil moisture data assimilation on ET estimates

Figure 4 compares the average seasonal cycles of RMSE and bias in Qle estimates (again, relative
to the two reference datasets) from the NLDAS-like simulation without any data assimilation
(termed the "open loop" (OL) simulation) and the two integrations that assimilate surface soil
moisture retrievals from the NASA and LPRM products (NASA-DA and LPRM-DA, respectively). In this
comparison, all three model integrations employ Noah 3.2. The assimilation of soil moisture
retrievals affects the Qle estimates primarily during the summer and fall months. The Qle
estimates from the open loop simulation are systematically improved by the assimilation of LPRM
soil moisture retrievals, whereas the assimilation of NASA retrievals degrades them. Compared to
FLUXNET, the domain-averaged RMSE of the open loop integration is 27.6 W m⁻², and it increases
to 29.4 W m⁻² in the NASA-DA integration (Table 2). The improvements shown in Figure 4 from
LPRM-DA translate to a domain-averaged RMSE of 25.6 W m⁻² when compared to FLUXNET. The trends
in RMSE and bias are similar in the comparisons against MOD16. The RMSE of the open loop
integration is 22.7 W m⁻², and it improves to 21.9 W m⁻² with the assimilation of LPRM soil
moisture retrievals. The assimilation of NASA soil moisture retrievals degrades the ET
estimates, with a domain-averaged RMSE of 24.5 W m⁻².

To quantify the spatial improvements due to assimilation, we define an "improvement metric" as
the difference between the RMSE of the integration with data assimilation and the RMSE of the
open loop integration (RMSE (DA) - RMSE (OL)). If data assimilation improves the flux estimates
(i.e., reduces the RMSE), then the improvement metric will be negative. Conversely, the
improvement metric will be positive if the assimilation degrades the flux estimates. Figures 5
and 6 present the improvement metric, stratified seasonally, from both assimilation integrations
as compared to both reference ET datasets. Figure 5 shows the improvement metric when using the
NASA AMSR-E product, and Figure 6 shows the corresponding comparisons when using the LPRM
product. In both sets of comparisons, the LPRM-based assimilation provides more systematic
improvements in the flux estimates, whereas the NASA-based integration indicates degradations
over several regions. For example, during the MAM months, the flux estimates from NASA-DA show
degradation over the Southern Great Plains, with improvements observed over Illinois, Indiana,
Ohio and areas along the Mississippi River; these are the same areas discussed earlier for
Figure 3, where Noah 3.2 had higher simulated ET. The LPRM-DA integration, on the other hand,
shows improvements over large areas of the Midwest and South-central U.S. during MAM, with no
significant degradations observed as a result of soil moisture assimilation. During JJA, in the
comparisons using the FLUXNET data, NASA-DA shows degradations in most regions, interspersed
with improvements over a few regions near North Dakota, Illinois, Eastern Texas and the West
Coast. The MOD16-based comparisons show similar results, with small regions of improvement over
the Central and Eastern U.S. and degradations over most of the domain. In contrast, the JJA
comparisons for LPRM-DA show degradations in only a few regions, with improvements observed over
large areas of the Midwest U.S. During the SON months, similar trends are seen for NASA-DA, with
degradations over Mexico and regions near Ohio and Illinois (when compared to FLUXNET) and over
Mississippi River basin areas (when compared to MOD16). The LPRM-DA simulations show
improvements over the Midwest and Central U.S. in the comparisons against MOD16 and no
significant degradations in the comparisons against FLUXNET.
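The improvement metric defined above can be written out for a single grid cell as a minimal
sketch. The function names rmse and improvement_metric and the sample series are illustrative
assumptions; the article does not describe its implementation.

```python
import math


def rmse(model, reference):
    """Root mean square error between two equal-length series."""
    if len(model) != len(reference):
        raise ValueError("series must have the same length")
    squared = sum((m - r) ** 2 for m, r in zip(model, reference))
    return math.sqrt(squared / len(model))


def improvement_metric(da, open_loop, reference):
    """RMSE(DA) - RMSE(OL); negative values mean assimilation helped."""
    return rmse(da, reference) - rmse(open_loop, reference)


# Illustrative Qle series for one grid cell in W m^-2 (made up).
reference = [50.0, 60.0, 70.0]
open_loop = [54.0, 64.0, 74.0]  # constant +4 error, so RMSE is 4
da_run = [52.0, 62.0, 72.0]     # constant +2 error, so RMSE is 2
metric = improvement_metric(da_run, open_loop, reference)
```

Applied to the domain-averaged FLUXNET values in Table 2, the same difference is
25.6 - 27.6 = -2.0 W m⁻² for LPRM-DA (an improvement) and 29.4 - 27.6 = +1.8 W m⁻² for NASA-DA
(a degradation).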

Further analysis of the differences shown in Figures 5 and 6 reveals that their magnitudes are
strongly related to landcover type. By stratifying the Qle RMSE improvements due to DA with
respect to landcover type (not shown), we found that the most significant improvements occur in
croplands for both soil moisture datasets and both reference datasets. Grassland was also found
to have significant changes in Qle RMSE with both datasets, more so with respect to the MOD16
reference data. In general, DA does not occur over heavily vegetated regions, because areas with
high vegetation water content, which make soil moisture retrievals difficult, are masked out.
Nonetheless, our results suggest modest changes over evergreen needleleaf forests and woodlands,
especially for the NASA product.
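The landcover stratification described above amounts to averaging a per-cell metric within each
landcover class. The following is a minimal sketch under that reading; the function name
mean_by_landcover, the use of None to mark cells masked out of assimilation, and the class
labels and numbers are all illustrative assumptions, since the analysis itself is not shown in
the article.

```python
from collections import defaultdict


def mean_by_landcover(metric_per_cell, landcover_per_cell):
    """Average a per-cell metric within each landcover class."""
    if len(metric_per_cell) != len(landcover_per_cell):
        raise ValueError("inputs must have the same length")
    groups = defaultdict(list)
    for value, cover in zip(metric_per_cell, landcover_per_cell):
        if value is None:  # cell masked out, e.g. dense vegetation
            continue
        groups[cover].append(value)
    return {cover: sum(vals) / len(vals) for cover, vals in groups.items()}


# Illustrative improvement-metric values in W m^-2 (made up).
metric = [-3.0, -1.0, 0.5, None, -0.5]
cover = ["cropland", "cropland", "grassland", "forest", "grassland"]
by_cover = mean_by_landcover(metric, cover)
```

A class whose cells are all masked does not appear in the result, which keeps masked regions
from being reported as having zero change.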

To relate the improvements and degradations in latent heat flux estimates to changes in soil
moisture resulting from data assimilation, we compare surface soil moisture difference maps in
Figure 7. The difference maps show the mean surface soil moisture of the data assimilation
integration minus the mean surface soil moisture of the open loop integration. In other words,
the difference maps represent the changes in soil moisture introduced by data assimilation,
averaged seasonally. A negative difference indicates that the soil is drier due to assimilation,
and a positive difference indicates that the soil is wetter. Comparing Figures 5 and 6 against
Figure 7 shows that the spatial patterns of the improvement metric correlate well with those of
the soil moisture difference maps. For example, during MAM, both the NASA-DA and the LPRM-DA
(with a smaller magnitude) soil moisture difference maps indicate drier patterns over Illinois,
Indiana, Ohio and areas along the Mississippi River, which lead to corresponding improvements in
latent heat flux estimates (Figure 5) over these same regions. During JJA, the soil moisture
changes due to assimilation of the NASA and LPRM retrievals are generally of opposite sign, and
seem to show a mix of improvements and degradations in the fluxes, depending on landcover.
During SON, assimilation of NASA retrievals dries the soil over Lower Texas and Mexico (relative
to OL), and it leads to a corresponding degradation in the flux estimates over these regions
when compared to FLUXNET (Figure 5, panel for SON). Similar patterns of tight correlation
between the soil moisture difference patterns and the flux improvement patterns can be observed
in the LPRM-DA integration. During JJA, over the Midwest U.S., LPRM-DA causes the soil moisture
simulations to be drier than the open loop simulation, leading to improvements in the latent
heat flux estimates over the same areas.

It is important to note that the CDF-scaling approach for data assimilation used here (and
described previously) is intended to preserve the soil moisture climatology of the LSM while
taking advantage of observed anomalies. Therefore, the results suggest that the improvements and
degradations in Qle due to soil moisture assimilation are a direct result of improved or
degraded soil moisture stress responses in the stomatal resistance formulation of the Noah 3.2
LSM. This sort of non-linear feedback is likely due to the ability of LPRM-DA to redistribute
water in a seasonal cycle that corresponds to Noah's biases in soil moisture and ET. By design,
and as verified by us (not shown), the soil moisture increments from both NASA-DA and LPRM-DA do
not change the mean surface soil moisture relative to the open loop. However, as most easily
explained via the bias time series in Figure 4, significant seasonal changes in soil moisture
translate into varying ET responses. In winter, the change in ET from the OL to the DA is
nominal (since Rnet, and therefore Qle, is small). In spring, LPRM-DA tends to reduce soil
moisture relative to the open loop, and therefore reduces the Qle bias, while NASA-DA increases
soil moisture and the Qle bias. In the summer, the open loop skill is higher, and both products
further compensate for errors, with LPRM tending to overdry and NASA tending to be too wet.
Overall, the net effect is a domain-averaged reduction of about 2 W m⁻² in total Qle for
LPRM-DA and an increase of about 3 W m⁻² in Qle for NASA-DA, consistent with the changes in
bias reported in Table 2.
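The idea behind CDF scaling can be illustrated with a minimal rank-matching sketch: an observed
value is mapped to the model value that has the same position in the model's own climatology.
This uses a simple nearest-rank empirical CDF and is an assumption for illustration only; it is
not necessarily the scheme implemented in the Land Information System. The function name
cdf_match and the sample climatologies are hypothetical.

```python
from bisect import bisect_right


def cdf_match(value, obs_climatology, model_climatology):
    """Map an observation onto the model climatology by matching ranks."""
    obs_sorted = sorted(obs_climatology)
    model_sorted = sorted(model_climatology)
    n_obs, n_model = len(obs_sorted), len(model_sorted)
    # Number of climatological observations less than or equal to value.
    rank = bisect_right(obs_sorted, value)
    # Nearest-rank quantile in the model climatology, in integer arithmetic:
    # ceil(rank * n_model / n_obs) - 1, clamped to a valid index.
    index = (rank * n_model + n_obs - 1) // n_obs - 1
    index = min(max(index, 0), n_model - 1)
    return model_sorted[index]


# Illustrative surface soil moisture climatologies in m^3 m^-3 (made up).
obs = [0.10, 0.20, 0.30, 0.40]
model = [0.25, 0.30, 0.35, 0.40]
rescaled = [cdf_match(v, obs, model) for v in obs]
```

In this example the rescaled observations reproduce the model climatology exactly, so their
mean matches the model mean, while the ordering of wet and dry anomalies in the observations is
kept.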

Evaluation of the surface energy partition from LDAS simulations

The surface energy partition consists of two key components, the latent heat and the sensible
heat fluxes. Though the primary focus of this article is to evaluate the latent heat estimates,
we also evaluated the sensible heat flux (Qh) estimates from the LDAS simulations using FLUXNET
data to examine whether the trends seen in the Qle estimates are consistent for both energy
partition terms. Trends in error metrics similar to those seen with latent heat flux estimates
are found in the sensible heat estimates. As shown in Table 2, Qh estimates from GLDAS have a
domain-averaged RMSE of 23.4 W m⁻², whereas the NLDAS-like simulations with Noah 2.7.1 and Noah
3.2 have domain-averaged RMSEs of 21.1 and 32.5 W m⁻², respectively. The assimilation of LPRM
data improves the domain-averaged RMSE to 30.4 W m⁻² (relative to the open loop NLDAS-like
simulation with Noah 3.2), whereas the assimilation of NASA retrievals degrades the sensible
heat flux estimates, with a domain-averaged RMSE of 34.5 W m⁻². Spatial patterns of
improvements and degradations similar to those seen in Figures 5 and 6 are observed for sensible
heat fluxes as well (not shown; for FLUXNET only, as Qh is not available from MOD16). Again,
these trends are statistically significant, as seen from the confidence interval values
presented in Table 2.

SUMMARY

This article provides a description of the capabilities of Land Data Assimilation Systems (LDAS)
for generating ET estimates and presents a quantitative evaluation of ET estimates, expressed as
latent heat flux (Qle), from a number of LDAS simulations. The simulated ET values from GLDAS
and from LSM simulations conducted over the NLDAS domain using two different versions of the
Noah land surface model (Noah version 2.7.1 and version 3.2) are compared against two reference
ET datasets: the gridded tower-based estimates from the FLUXNET measurements and ET estimates
based on MODIS satellite data, known as the MOD16 product. The article also presents an
evaluation of the impact of soil moisture data assimilation on ET estimation. The data
assimilation integrations employ two different retrievals of the AMSR-E soil moisture
measurements: the NASA Level-3 product and the AMSR-E Land Parameter Retrieval Model product
from VU Amsterdam.

The evaluation of ET fields indicates that the simulation using NLDAS forcing with Noah 2.7.1
provides slightly better estimates among the LDAS simulations without data assimilation,
although all Noah simulations suffer from significant high biases relative to the two reference
datasets. This could be due to a combination of parameter and structural errors, in addition to
errors in the reference data themselves. The three LDAS simulations differ most during the
spring and fall months. Comparison of the seasonally averaged ET fluxes shows overestimation
during MAM in all three LDAS products over the Southeast and Lower Mississippi River basin.
During JJA, the differences between the two Noah model-based simulations are more prominent over
vegetated crop regions in the upper Midwest and irrigated areas along the Mississippi River. The
GLDAS product underestimates ET during the SON months, whereas the NLDAS-forced Noah simulations
show better agreement with both FLUXNET and MOD16 estimates during this period.

The assimilation of surface soil moisture impacts the ET estimates, particularly during the
spring (MAM) and summer (JJA) months, which is when the expected impacts would be largest due to
increasing soil moisture stress, insolation, and vegetation fraction. The assimilation of LPRM
retrievals demonstrates systematic, statistically significant but modest improvements in ET
estimates relative to the Noah model simulation without data assimilation. The assimilation of
NASA retrievals, on the other hand, provides mixed results, with improvements in a few regions
of the NLDAS domain. Overall, the integration using the NASA soil moisture retrievals degrades
the open loop ET estimates. The results also indicate strong correlations between the
improvements and degradations of ET estimates and the changes in soil moisture fields introduced
by soil moisture assimilation. Finally, the analysis of the sensible heat flux estimates
indicates consistent trends in both surface energy partition terms (latent and sensible heat
fluxes).

Table 1: Parameters for perturbations to meteorological forcings and soil moisture prognostic
model variables in the data assimilation integrations.

Meteorological forcings

Variable             Perturbation    Standard     Cross-correlations with perturbations in
                     type            deviation    Downward     Downward     Precipitation
                                                  shortwave    longwave
Downward shortwave   Multiplicative  0.3 [-]       1.0         -0.5         -0.8
Downward longwave    Additive        50 W m⁻²     -0.5          1.0          0.5
Precipitation        Multiplicative  0.5 [-]      -0.8          0.5          1.0

Noah LSM soil moisture states

Variable                           Perturbation  Standard          Cross-correlations with perturbations in
                                   type          deviation         sm1    sm2    sm3    sm4
Total soil moisture layer 1 (sm1)  Additive      0.6E-3 m³ m⁻³     1.0    0.6    0.4    0.2
Total soil moisture layer 2 (sm2)  Additive      1.1E-4 m³ m⁻³     0.6    1.0    0.6    0.4
Total soil moisture layer 3 (sm3)  Additive      0.6E-5 m³ m⁻³     0.4    0.6    1.0    0.6
Total soil moisture layer 4 (sm4)  Additive      0.4E-5 m³ m⁻³     0.2    0.4    0.6    1.0
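
One common way to draw perturbations with prescribed cross-correlations, such as those listed
for the meteorological forcings in Table 1, is to multiply independent standard normal draws by
the Cholesky factor of the correlation matrix. The sketch below illustrates that technique only;
the article does not state how the perturbations were generated, and scaling by the standard
deviations or applying the draws multiplicatively or additively is deliberately left out. The
function names and the random seed are illustrative assumptions.

```python
import math
import random

# Cross-correlations among forcing perturbations from Table 1
# (order: downward shortwave, downward longwave, precipitation).
FORCING_CORR = [
    [1.0, -0.5, -0.8],
    [-0.5, 1.0, 0.5],
    [-0.8, 0.5, 1.0],
]


def cholesky(matrix):
    """Lower-triangular factor whose product with its transpose is matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def correlated_normals(lower, rng):
    """One draw of unit-variance normals with the target correlations."""
    z = [rng.gauss(0.0, 1.0) for _ in lower]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(lower))]


def sample_corr(draws, i, j):
    """Pearson correlation between components i and j of the draws."""
    n = len(draws)
    mean_i = sum(d[i] for d in draws) / n
    mean_j = sum(d[j] for d in draws) / n
    cov = sum((d[i] - mean_i) * (d[j] - mean_j) for d in draws) / n
    var_i = sum((d[i] - mean_i) ** 2 for d in draws) / n
    var_j = sum((d[j] - mean_j) ** 2 for d in draws) / n
    return cov / math.sqrt(var_i * var_j)


rng = random.Random(0)  # fixed seed so the example is repeatable
lower_factor = cholesky(FORCING_CORR)
draws = [correlated_normals(lower_factor, rng) for _ in range(20000)]
```

The same factorization applies to the 4 x 4 soil moisture correlation matrix in Table 1, which
is also positive definite.
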
Table 2: NLDAS domain-averaged root mean square and bias errors (all with 95% confidence
intervals) in latent heat flux (Qle) and sensible heat flux (Qh) estimates from five LDAS
simulations with respect to two reference datasets: (1) the gridded FLUXNET data from Jung et
al. (2009) and (2) MOD16 data from Mu et al. (2011), which only provides Qle. Two of the NLDAS
simulations show differences due to Noah version, and two of the NLDAS simulations include soil
moisture data assimilation from the NASA and LPRM products, as discussed in the text.

                                       FLUXNET                        MOD16
Qle                            RMSE (W m⁻²)   Bias (W m⁻²)    RMSE (W m⁻²)   Bias (W m⁻²)
GLDAS                          24.7 ± 0.3       5.5 ± 0.4     28.0 ± 0.2      4.4 ± 0.3
NLDAS (Noah v2.7.1)            19.3 ± 0.3      11.9 ± 0.4     21.5 ± 0.2     10.3 ± 0.3
NLDAS (Noah v3.2)              27.6 ± 0.3      12.9 ± 0.4     22.7 ± 0.2     11.2 ± 0.3
NLDAS (Noah v3.2) + NASA DA    29.4 ± 0.3      15.9 ± 0.4     24.5 ± 0.2     14.2 ± 0.3
NLDAS (Noah v3.2) + LPRM DA    25.6 ± 0.3      10.9 ± 0.3     21.9 ± 0.2      9.2 ± 0.3

Qh                             RMSE (W m⁻²)   Bias (W m⁻²)    RMSE (W m⁻²)   Bias (W m⁻²)
GLDAS                          23.4 ± 0.2      -5.6 ± 0.4     N/A            N/A
NLDAS (Noah v2.7.1)            21.1 ± 0.3      -7.0 ± 0.4     N/A            N/A
NLDAS (Noah v3.2)              32.5 ± 0.3      -9.2 ± 0.4     N/A            N/A
NLDAS (Noah v3.2) + NASA DA    34.5 ± 0.3     -12.2 ± 0.4     N/A            N/A
NLDAS (Noah v3.2) + LPRM DA    30.4 ± 0.3      -7.3 ± 0.4     N/A            N/A

Figure 6: Same as Figure 5, but from data assimilation integrations using LPRM AMSR-E soil
moisture retrievals (LPRM-DA).

Figure 7: Comparison of the mean soil moisture difference maps (soil moisture (DA) - soil
moisture (OL)) from soil moisture data assimilation integrations. The left and right panels
represent the NASA-DA and LPRM-DA, respectively. All units are in m³ m⁻³.